In recent years, the expansion of Fintech has accelerated the development of the online peer-to-peer lending market, which offers a substantial investment opportunity by connecting borrowers directly to lenders without traditional financial intermediaries. This innovative approach is, however, accompanied by increasing default risk, since information asymmetry tends to rise with online businesses. This paper aimed to predict borrowers' probability of default using data from LendingClub, the leading American online peer-to-peer lending platform. For this purpose, three machine learning methods were employed: logistic regression, random forest and neural network. Before the scoring models were built, the LendingClub model was assessed using the grades attributed to the borrowers in the dataset. The results indicated that the LendingClub model performed poorly, with an AUC of 0.67, whereas the logistic regression (AUC 0.9), the random forest (AUC 0.9) and the neural network (AUC 0.93) displayed better predictive power. The neural network classifier thus outperformed the other models, with the highest AUC. No difference was noted in the accuracy of the models, which was 0.9 in each case. In addition, to improve their investment decisions, investors might take into consideration the relationship between certain variables and the likelihood of default. For instance, the higher the loan amount, the higher the likelihood of default, and the higher the debt-to-income ratio, the higher the likelihood of default. Conversely, the higher the annual income, the lower the probability of default. The probability of default also tends to decline as the number of total open accounts rises.
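The comparison above rests on the AUC, which can be read as the probability that a randomly chosen defaulted loan receives a higher risk score than a randomly chosen repaid loan. The following is a minimal sketch of that computation, not the paper's actual pipeline: the function name `roc_auc`, the numeric encoding of grades, and all data values are hypothetical and invented purely for illustration.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: share of (defaulted, repaid) pairs in which the
    defaulted loan has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]  # defaulted loans
    neg = [s for y, s in zip(labels, scores) if y == 0]  # repaid loans
    if not pos or not neg:
        raise ValueError("need at least one defaulted and one repaid loan")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumption: NOT LendingClub records). 1 = default, 0 = repaid.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
# Hypothetical encoding of platform grades as risk scores (higher = riskier).
grade_score = [1, 2, 2, 3, 3, 4, 4, 5]
# Hypothetical predicted default probabilities from a fitted classifier.
model_prob = [0.05, 0.10, 0.20, 0.15, 0.60, 0.30, 0.70, 0.90]

auc_grade = roc_auc(labels, grade_score)
auc_model = roc_auc(labels, model_prob)
print(f"grade-based AUC: {auc_grade:.3f}")
print(f"model-based AUC: {auc_model:.3f}")
```

Because the AUC depends only on how the scores rank the loans, a coarse ordinal grade and a continuous predicted probability can be compared on the same scale, which is what allows a grade-based baseline to be set against fitted classifiers.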